CONVEX HULLS OF UNIFORM SAMPLES FROM A CONVEX POLYGON

PIET GROENEBOOM,* Delft University of Technology 

Abstract 

In Groeneboom (1988) a central limit theorem for the number of vertices $N_n$
of the convex hull of a uniform sample from the interior of a convex polygon was
derived. To be more precise, it was shown that $\{N_n - \tfrac23 r\log n\}/\{\tfrac{10}{27} r\log n\}^{1/2}$
converges in law to a standard normal distribution, if $r$ is the number of vertices
of the convex polygon from which the sample is taken.

In the unpublished preprint Nagaev and Khamdamov (1991) a central limit 
result for the joint distribution of $N_n$ and $A_n$ is given, where $A_n$ is the area
of the convex hull, using a coupling of the sample process near the border 
of the polygon with a Poisson point process as in Groeneboom (1988), and 
representing the remaining area in the Poisson approximation as a union of a 
doubly infinite sequence of independent standard exponential random variables. 
We derive this representation from the representation in Groeneboom (1988) 
and also prove the central limit result of Nagaev and Khamdamov (1991), 
using this representation. The relation between the variances of the asymptotic
normal distributions of the number of vertices and the area, established in Nagaev
and Khamdamov (1991), corresponds to a relation between the actual sample
variances of $N_n$ and $A_n$ in Buchta (2005). We show how these asymptotic
results all follow from one simple guiding principle. This corrects at the same 
time the scaling constants in Cabo and Groeneboom (1994) and Nagaev (1995). 
Keywords: convex hulls

1. Introduction

Let $N_n$ be the number of vertices of the convex hull of a sample of size $n$, drawn uniformly from
the interior of a convex polygon with $r$ vertices. It was shown in Groeneboom (1988) that

$$\{N_n - \tfrac23 r\log n\}/\{\tfrac{10}{27} r\log n\}^{1/2} \xrightarrow{\mathcal{D}} \mathcal{N}(0,1),$$

where $\mathcal{N}(0,1)$ denotes the standard normal distribution. This was proved by coupling the sample
point process near the boundary of the convex polygon with a Poisson point process, and showing 
that the relevant part of the sample process could be approximated sufficiently closely by the coupled 
Poisson point process. The central limit result for $N_n$ was subsequently derived from a corresponding
result for the boundary of the convex hull of the approximating Poisson point process. These
methods were also applied to the area $A_n$ of the convex hull in Cabo and Groeneboom (1994), but
unfortunately the central limit result for $A_n$ contained a scaling error (see Remark 3.2).

Nagaev and Khamdamov (1991), using the coupling of (part of) the sample point process with
a Poisson process introduced in Groeneboom (1988), derived the following interesting central limit
theorem for the joint distribution of the number of vertices and the area of the convex hull of a
uniform sample of $n$ points from the interior of a convex polygon.

Theorem 1.1. (Theorem 1 of Nagaev and Khamdamov (1991)) Let $N_n$ denote the number of vertices
of the convex hull of a uniform sample of size $n$ from the interior of a convex polygon $C$ with
$r \ge 3$ vertices and area $A(C)$. Moreover, let $A_n$ denote the area of the convex hull of the sample,
and let the scaled "remaining area" $\tilde A_n$ be defined by

$$\tilde A_n = n\{A(C) - A_n\}/A(C).$$

Then

$$\bigl(\tfrac{10}{27} r\log n\bigr)^{-1/2}\bigl(N_n - \tfrac23 r\log n,\ \tilde A_n - \tfrac23 r\log n\bigr) \xrightarrow{\mathcal{D}} \mathcal{N}(0,\Sigma), \tag{1.1}$$

where $\mathcal{N}(0,\Sigma)$ denotes the normal distribution with expectation the zero vector and covariance matrix
$\Sigma$ given by

$$\Sigma = \begin{pmatrix} 1 & 1 \\ 1 & \tfrac{14}{5} \end{pmatrix}.$$

This is an extension of the central limit theorem for the number of vertices $N_n$ in Groeneboom
(1988), and one indeed recovers the central limit theorem given there by specializing the above result 
to the first coordinate. Unfortunately, the preprint Nagaev and Khamdamov (1991), containing this 
result, was never published. Moreover, it is written in Russian and its length is 50 pages, which 
might also not have helped its spread in the scientific world. 

In private correspondence Christian Buchta revealed to me that the constant for the central
limit theorem for the second component (the remaining area) in Nagaev and Khamdamov (1991)
was consistent with a relation he had derived himself between the finite sample variances of $N_n$ and
$A_n$.

It is the purpose of the present note to give a simple proof of Theorem 1.1, deriving the result
from the central limit theorem for $N_n$ in Groeneboom (1988). We think that using the central limit
theorem of Groeneboom (1988) considerably simplifies the proof of Theorem 1.1 in Nagaev and
Khamdamov (1991) and perhaps more clearly reveals the beauty of their idea. The relation between 
the variances in Theorem 1.1 can be considered to be a precursor (in an asymptotic sense) of the 
relation found between the finite sample variances in Buchta (2005). 

For recent work on central limit theorems for random polytopes, see, e.g., Barany and Reitzner 
(2010a) and Barany and Reitzner (2010b), where also references to earlier work in this area can be 
found. 



2. Representation of the remaining area by i.i.d. exponentials 



We consider the Poisson point process $\mathcal{P}$ of intensity 1 in $\mathbb{R}_+^2$, and its left-lower convex hull,
as in Groeneboom (1988). To make the connection with Groeneboom (1988), we first restate the
definition of the process $\{W(a) : a \in \mathbb{R}_+\}$, consisting of the vertices of the (left-lower)
convex hull of a Poisson process $\mathcal{P}$ with intensity 1 in $\mathbb{R}_+^2$.




Figure 1: The $W(a)$-process.



Definition 1. For each $a > 0$, $W(a) = (U(a), V(a))$ is the point of the realization of the Poisson
process $\mathcal{P}$ on $\mathbb{R}_+^2$ such that all points of the realization of $\mathcal{P}$ lie to the right of the line
$x + ay = c$ which passes through $W(a)$. If there are several such points (which happens with
probability zero for fixed $a$), we define $U(a)$ ($V(a)$) as the supremum (infimum) of the x-coordinates
(y-coordinates) of points of this type.

We now have the following result (see also Theorem 2.1 of Nagaev (1995) for a result of this type). 

Theorem 2.1. Let $a_0 = 1$, let $a_1, a_2, \dots$ be the jump times of the process $\{W(a) : a \ge 1\}$, and let $D_0$
be the area of the isosceles triangle $T_0$ with a basis running through $W(1)$, and two equal sides along
the x- and y-axis, meeting at the top at the origin. Moreover, let $D_i$, $i \ge 1$, be the area of the triangle
$T_i$, with top at $W(a_{i-1})$, basis along the x-axis, and sides along the lines
$x + a_{i-1}y = U(a_{i-1}) + a_{i-1}V(a_{i-1})$ and $x + a_i y = U(a_i) + a_i V(a_i)$,
where $W(a_i)$, $U(a_i)$ and $V(a_i)$ are defined as in Definition 1. Then

(i) The areas $D_0, D_1, \dots$ form an i.i.d. sequence of standard exponential random variables.

(ii) Let $S_i$ be the length of the line segment connecting $W(a_{i-1})$ and $W(a_i)$, and let $L_i$ be the
length of the segment obtained by extending the line segment from $W(a_{i-1})$ to $W(a_i)$ until it
crosses the x-axis. Then the random variables $S_i^2/L_i^2$, $i = 1, 2, \dots$, form an i.i.d. sequence of
Uniform(0,1) random variables, independent of $W(1)$. Moreover, the $S_i^2/L_i^2$ are independent
of the sequence $D_0, D_1, \dots$

Proof. (i). By Part (i) of Lemma 2.4 of Groeneboom (1988) we have, for $z > 0$,

$$P\{D_0 > z\} = P\bigl\{\tfrac12\{U(1) + V(1)\}^2 > z\bigr\} = \int_{\{(x,y)\,:\,\frac12 (x+y)^2 > z\}} e^{-\frac12 (x+y)^2}\,dx\,dy = e^{-z},$$

showing that $D_0$ has a standard exponential distribution. Let $\mathcal{F}_a$ denote the $\sigma$-algebra generated
by the points $\{W(b), 1 \le b \le a\}$. Then, as shown in Groeneboom (1988), the process of points
$\{W(a), a \ge 1\}$ is a Markov process w.r.t. the filtration $\{\mathcal{F}_a, a \ge 1\}$. Now note that, if $i \ge 1$, $D_i > z$
exactly when there are no points in the triangle of area $z$, with top at $W(a_{i-1})$, basis along the x-axis,
and sides along the line $x + a_{i-1}y = U(a_{i-1}) + a_{i-1}V(a_{i-1})$ and the line $x + ay = U(a_{i-1}) + aV(a_{i-1})$,
with $a > a_{i-1}$ chosen such that the area equals $z$. Since this
event is independent of the location of the points $W(a_0), \dots, W(a_{i-1})$, by the Poisson property of
the point process in $\mathbb{R}_+^2$, we get:

$$P\{D_i > z\} = e^{-z}, \qquad z > 0,$$

where the event $D_i > z$ is independent of $D_0, \dots, D_{i-1}$ (note that we can use the strong Markov
property here).
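The value $e^{-z}$ in part (i) can be checked by a one-dimensional substitution. The following worked step is only a sketch: the auxiliary variable $s = x + y$ is introduced here for the computation, and the density of $W(1)$ on the quadrant is taken to be $e^{-(x+y)^2/2}$, which is what makes the stated value come out.

```latex
\begin{align*}
\int_{\{(x,y)\in\mathbb{R}_+^2 \,:\, \frac12 (x+y)^2 > z\}} e^{-\frac12 (x+y)^2}\,dx\,dy
  &= \int_{\sqrt{2z}}^{\infty} \Bigl(\int_0^{s} dx\Bigr) e^{-\frac12 s^2}\,ds
   = \int_{\sqrt{2z}}^{\infty} s\,e^{-\frac12 s^2}\,ds \\
  &= \Bigl[-e^{-\frac12 s^2}\Bigr]_{s=\sqrt{2z}}^{\infty} = e^{-z}.
\end{align*}
```

The inner integral is the length (in the $x$-coordinate) of the segment $\{x + y = s\}$ inside the quadrant, so $\tfrac12\{U(1)+V(1)\}^2$ is indeed standard exponential under this density.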

(ii). The jump measure $M(a, w; \cdot)$ of the process $\{W(a) : a > 0\}$ is given by

$$M(a, w; B) = \int_0^{v} u\,1_B(au, -u)\,du, \tag{2.1}$$

see (2.22) of Groeneboom (1988). Hence, conditioning on $W(a-) = W(a_{i-1}) = (x,y)$ and the event
that there is a jump at time $a$, the location of the next vertex has a density proportional to $u$
(representing the distance of $(x,y)$ to the next vertex). So we get, for $z \in (0,1)$,

$$\begin{aligned}
P\{S_i^2/L_i^2 \le z \mid W(a) \ne W(a-) = (x,y)\}
 &= P\{S_i \le L_i\sqrt{z} \mid W(a) \ne W(a-) = (x,y)\} \\
 &= P\bigl\{S_i \le y\sqrt{z(1+a^2)} \mid W(a) \ne W(a-) = (x,y)\bigr\} \\
 &= \frac{2}{y^2\{1+a^2\}} \int_0^{y\sqrt{z(1+a^2)}} u\,du = z,
\end{aligned}$$

where we use that $\tfrac12 y^2\{1+a^2\}$ is the total measure of the jump measure on the line segment of length
$y\sqrt{1+a^2}$, connecting $(x,y)$ and $(x+ay, 0)$. This implies that $S_i^2/L_i^2$ has a uniform distribution, in
accordance with Theorem 2.1 of Nagaev (1995). Moreover, since the distribution neither involves the
value of $a = a_i$ nor that of $W(a_{i-1})$, the sequence of variables $S_i^2/L_i^2$ is i.i.d. For the same reason the
variable $S_i^2/L_i^2$ is independent of $D_j$, $j \le i$. It is also seen that $S_i^2/L_i^2$ is independent of $D_j$, $j > i$,
since the conditional distribution of $D_{i+1}$, given $W(a_i)$, is standard exponential, independently of
the value of $W(a_i)$.

Corollary 1. Let the sequences $a_0, a_1, \dots$ and $V(a_0), V(a_1), \dots$ be defined as in Theorem 2.1, and
let $\tau_i = V(a_i)/V(a_{i-1})$, $i = 1, 2, \dots$ Then the sequence of random variables $\tau_1, \tau_2, \dots$ is i.i.d. and

$$(1 - \tau_i)^2 \sim \text{Uniform}(0,1).$$

Moreover, the random variables $\tau_i$ are independent of $V(a_0) = V(1)$ and the areas $D_i$, where $D_i$ is
defined as in Theorem 2.1.

Proof. This follows from part (ii) of Theorem 2.1 since

$$1 - \tau_i = 1 - \frac{V(a_i)}{V(a_{i-1})} = \frac{S_i}{L_i},$$

where the last equality is the proportionality relation, well-known from elementary geometry.

The following result is the key to Theorem 1.1. 

Corollary 2. Let, for $m = 2, 3, \dots$, $N(1,m)$ be the number of jumps of the process $\{W(a) : a \in [1,m]\}$
and let $[EN(1,m)]$ be the largest integer smaller than or equal to $EN(1,m)$. Then:

(i)
$$EN(1,m) = \tfrac13\log m,$$

(ii) As $m \to \infty$ the bivariate random variable

$$\Bigl(\{N(1,m) - EN(1,m)\}\Big/\sqrt{\tfrac{5}{27}\log m},\ \ \sum_{i=1}^{[EN(1,m)]} (D_i - 1)\Big/\sqrt{EN(1,m)}\Bigr)$$

converges in distribution to a bivariate normal distribution with expectation zero and covariance
matrix equal to the identity matrix $I$.

Proof. (i). This is part (i) of Theorem 2.4 of Groeneboom (1988), which is a simple consequence
of the fact that the expected jump rate of the process $\{W(a) : a \ge 1\}$ is given by $1/(3a)$.
(ii). The area $D_i$ of the triangle $T_i$, as defined in Theorem 2.1, is given by:

$$D_i = \tfrac12 V(a_{i-1})\bigl(U(a_{i-1}) + a_i V(a_{i-1}) - U(a_{i-1}) - a_{i-1}V(a_{i-1})\bigr) = \tfrac12 V(a_{i-1})^2 (a_i - a_{i-1}). \tag{2.2}$$

Define

$$U_i = U(a_i), \quad V_i = V(a_i), \quad \text{and} \quad W_i = (U_i, V_i), \qquad i = 0, 1, \dots$$

It is clear that (2.2) gives a tridiagonal system for solving $a_i$ in terms of the $D_i$ and $V_i$. We get,
using $a_0 = 1$,

$$a_n = 1 + 2\sum_{i=1}^{n} \frac{D_i}{V_{i-1}^2}, \qquad n \ge 1. \tag{2.3}$$

We now define, for $n \ge 1$,

$$Y_n = V_{n-1}^2\Bigl\{1 + 2\sum_{i=1}^{n} \frac{D_i}{V_{i-1}^2}\Bigr\} = V_{n-1}^2 + 2\sum_{i=1}^{n} D_i\Bigl(\frac{V_{n-1}}{V_{i-1}}\Bigr)^2.$$

Thus,

$$\log a_n = -2\log V_{n-1} + \log Y_n, \tag{2.4}$$

and hence we get the "switching relation":

$$N(1,m) \ge n \iff a_n \le m \iff -2\log V_{n-1} + \log Y_n \le \log m. \tag{2.5}$$

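By Corollary 1:

$$EV_n^2 = EV_0^2\prod_{i=1}^{n} E\tau_i^2 = 6^{-n}EV_0^2, \qquad E\Bigl(\frac{V_n}{V_k}\Bigr)^2 = \prod_{i=k+1}^{n} E\tau_i^2 = 6^{-(n-k)}, \qquad n > k \ge 0. \tag{2.6}$$

Since, by Theorem 2.1, the $\tau_i$ are also independent of the $D_i$, we obtain, for all $n \ge 1$,

$$EY_n = EV_{n-1}^2 + 2\sum_{i=1}^{n} ED_i\,E\Bigl(\frac{V_{n-1}}{V_{i-1}}\Bigr)^2 = 6^{-(n-1)}EV_0^2 + 2\sum_{i=1}^{n} 6^{-(n-i)} \le 6^{-(n-1)}EV_0^2 + 2\sum_{j=0}^{\infty} 6^{-j}.$$

This implies, by Markov's inequality,

$$Y_n = O_p(1), \qquad n \to \infty.$$

Since we also have $Y_n \ge 2D_n$, for all $n \ge 1$, where $D_n$ has a standard exponential distribution, we
obtain from this:

$$|\log Y_n| = O_p(1), \qquad n \to \infty. \tag{2.7}$$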



We now get from (2.4):

$$\frac{\log a_n - 3n}{\sqrt{5n}} = \frac{-2\log V_{n-1} + \log Y_n - 3n}{\sqrt{5n}} = \frac{-2\log V_{n-1} - 3n}{\sqrt{5n}} + O_p(n^{-1/2}),$$

as $n \to \infty$. Moreover, since

$$-2\log V_{n-1} = -2\log\Bigl(\frac{V_{n-1}}{V_0}\Bigr) - 2\log V_0 = -2\sum_{i=1}^{n-1}\log\tau_i - 2\log V_0, \tag{2.8}$$

we get by the central limit theorem:

$$\frac{\log a_n - 3n}{\sqrt{5n}} = \frac{-2\sum_{i=1}^{n-1}\log\tau_i - 3n}{\sqrt{5n}} + o_p(1) \xrightarrow{\mathcal{D}} \mathcal{N}(0,1), \qquad n \to \infty, \tag{2.9}$$

where $\mathcal{N}(0,1)$ denotes the standard normal distribution.
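The centering $3n$ and the scaling $\sqrt{5n}$ in (2.9) rest on the moments $E(-2\log\tau_i) = 3$ and $\mathrm{var}(-2\log\tau_i) = 5$, and (2.6) rests on $E\tau_i^2 = 1/6$, all of which follow from $(1-\tau_i)^2 \sim$ Uniform(0,1). A minimal Monte Carlo sketch of these three values is given below; the function names, the sample size and the seed are illustrative choices, not part of the original text.

```python
import math
import random


def sample_tau(rng):
    """Draw tau = 1 - sqrt(U) with U ~ Uniform[0,1), so that (1 - tau)^2 is uniform."""
    return 1.0 - math.sqrt(rng.random())


def tau_moments(n_draws=200_000, seed=1):
    """Return Monte Carlo estimates of E tau^2, E(-2 log tau) and var(-2 log tau)."""
    rng = random.Random(seed)
    sum_tau_sq = 0.0
    sum_x = 0.0
    sum_x_sq = 0.0
    for _ in range(n_draws):
        tau = sample_tau(rng)
        x = -2.0 * math.log(tau)  # one summand of the sum appearing in (2.8)
        sum_tau_sq += tau * tau
        sum_x += x
        sum_x_sq += x * x
    mean_x = sum_x / n_draws
    var_x = sum_x_sq / n_draws - mean_x ** 2
    return sum_tau_sq / n_draws, mean_x, var_x


if __name__ == "__main__":
    e_tau_sq, mean_x, var_x = tau_moments()
    print("E tau^2          (theory 1/6):", e_tau_sq)
    print("E(-2 log tau)    (theory 3)  :", mean_x)
    print("var(-2 log tau)  (theory 5)  :", var_x)
```

The estimates should lie close to $1/6$, $3$ and $5$ up to Monte Carlo error.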



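Let

$$B_1(m) = \sum_{i=1}^{[EN(1,m)]} (D_i - 1)\Big/\sqrt{EN(1,m)},$$

and

$$B_2(m) = \{N(1,m) - EN(1,m)\}\Big/\sqrt{\tfrac{5}{27}\log m},$$

and let, for fixed $y \in \mathbb{R}$, $n = n_{m,y} \in \mathbb{N}$ be defined by:

$$n = EN(1,m) + y\sqrt{\tfrac{5}{27}\log m}, \qquad m \to \infty.$$

Then we find, using (2.5) and (2.9), as $m \to \infty$,

$$\begin{aligned}
P\{B_1(m) > x,\ B_2(m) \ge y\}
 &= P\Bigl\{B_1(m) > x,\ N(1,m) \ge EN(1,m) + y\sqrt{\tfrac{5}{27}\log m}\Bigr\} \\
 &\sim P\{B_1(m) > x,\ N(1,m) \ge n\} = P\{B_1(m) > x,\ \log a_n \le \log m\} \\
 &= P\Bigl\{B_1(m) > x,\ \frac{\log a_n - 3n}{\sqrt{5n}} \le \frac{\log m - 3n}{\sqrt{5n}}\Bigr\} \qquad (2.10) \\
 &\sim P\Bigl\{B_1(m) > x,\ \frac{-2\sum_{i=1}^{n-1}\log\tau_i - 3n}{\sqrt{5n}} \le \frac{\log m - 3EN(1,m) - 3y\sqrt{\tfrac{5}{27}\log m}}{\sqrt{5n}}\Bigr\} \\
 &= P\{B_1(m) > x\}\,P\Bigl\{\frac{-2\sum_{i=1}^{n-1}\log\tau_i - 3n}{\sqrt{5n}} \le \frac{\log m - 3n}{\sqrt{5n}}\Bigr\} \\
 &= P\{B_1(m) > x\}\,P\Bigl\{\frac{-2\sum_{i=1}^{n-1}\log\tau_i - 3n}{\sqrt{5n}} \le \frac{-3y\sqrt{\tfrac{5}{27}\log m}}{\sqrt{5n}}\Bigr\},
\end{aligned}$$

where we use part (i), (2.10) and Corollary 1 (independence of the $\tau_i$ and the $D_i$) in the next
to last line. Since $3\sqrt{\tfrac{5}{27}\log m}\big/\sqrt{5n} \to 1$, we get, by (2.9),

$$P\Bigl\{\frac{-2\sum_{i=1}^{n-1}\log\tau_i - 3n}{\sqrt{5n}} \le \frac{-3y\sqrt{\tfrac{5}{27}\log m}}{\sqrt{5n}}\Bigr\} \to \Phi(-y) = 1 - \Phi(y),$$

where $\Phi$ is the standard normal distribution function, and the result now follows.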



3. The central limit theorem 



In this section we prove a 2-dimensional central limit theorem, by combining the results of the
preceding section with the results in Groeneboom (1988).

Theorem 3.1. Let $N(a,b)$ be the number of jumps in the interval $[a,b]$ of the process $W$, as defined
in Definition 1, and let $D(a,b)$ be the area of the union of the triangles $T_i$, corresponding to points
of jump $a_i \in [a,b]$, as defined in Theorem 2.1. Then:

$$\bigl(\tfrac{5}{27}\log(b/a)\bigr)^{-1/2}\bigl(N(a,b) - \tfrac13\log(b/a),\ D(a,b) - \tfrac13\log(b/a)\bigr) \xrightarrow{\mathcal{D}} N(0,\Sigma), \qquad b/a \to \infty,$$

where $N(0,\Sigma)$ is a bivariate normal distribution with expectation $0$ and covariance matrix defined
by

$$\Sigma = \begin{pmatrix} 1 & 1 \\ 1 & \tfrac{14}{5} \end{pmatrix}.$$

Proof. As shown by the transformation to a stationary process (2.27) in Groeneboom (1988),
the distribution of $N(a,b)$ only depends on the ratio $b/a$. The same construction shows that the
distribution of $D(a,b)$ only depends on the ratio $b/a$. So we only have to prove the result for $a = 1$
and $b > 1$.

We know from Theorem 2.4 in Groeneboom (1988) that $EN(1,a) = \tfrac13\log a$ and $\mathrm{var}(N(1,a))
\sim \tfrac{5}{27}\log a$, as $a \to \infty$. Moreover,

$$D(1,a) = \sum_{a_i \in [1,a]} D_i = \sum_{a_i \in [1,a]} \mathrm{area}(T_i),$$

where the $T_i$ are the triangles of Theorem 2.1. So we can consider $D(1,a)$ as a random sum of
standard exponential random variables, where the number of terms in the sum is equal to the
random variable $N(1,a)$. Reasoning heuristically, as in the case of a compound Poisson distribution,
we would get

$$E(D(1,a)) = EN(1,a) = \tfrac13\log a,$$

and

$$\mathrm{var}(D(1,a)) = EN(1,a) + \mathrm{var}(N(1,a)) \sim \tfrac13\log a + \tfrac{5}{27}\log a = \tfrac{14}{27}\log a.$$

We now show that we can prove the result by using this heuristic idea. 
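The heuristic rests on the identity $\mathrm{var}(\sum_{i=1}^{N} X_i) = EN\,\mathrm{var}(X_1) + \mathrm{var}(N)(EX_1)^2$ for a random sum whose number of terms $N$ is independent of the i.i.d. summands; for standard exponential summands this is $EN + \mathrm{var}(N)$. The sketch below illustrates this numerically under simplifying assumptions that are not made in the text: $N$ is taken to be Binomial(75, 4/9), chosen only because its mean $100/3$ and variance $500/27$ match $\tfrac13\log a$ and $\tfrac{5}{27}\log a$ for $\log a = 100$, and it is drawn independently of the summands. The function name and parameters are illustrative.

```python
import random


def random_sum_moments(k=75, p=4.0 / 9.0, n_rep=20_000, seed=2):
    """Simulate sums of N standard exponentials, with N ~ Binomial(k, p) independent
    of the summands, and return the sample mean and sample variance of the sums."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_rep):
        # Binomial draw as a sum of k Bernoulli(p) indicators (hypothetical stand-in for N(1,a)).
        n_terms = sum(1 for _ in range(k) if rng.random() < p)
        totals.append(sum(rng.expovariate(1.0) for _ in range(n_terms)))
    mean = sum(totals) / n_rep
    var = sum((t - mean) ** 2 for t in totals) / (n_rep - 1)
    return mean, var


if __name__ == "__main__":
    mean, var = random_sum_moments()
    print("mean of random sum     (theory 100/3  ~ 33.33):", mean)
    print("variance of random sum (theory 1400/27 ~ 51.85):", var)
```

With these parameters the theoretical variance is $\tfrac{100}{3} + \tfrac{500}{27} = \tfrac{1400}{27}$, the value $\tfrac{14}{27}\log a$ for $\log a = 100$. In the actual process $N(1,a)$ is neither binomial nor independent of the $D_i$, which is why the proof needs the decomposition that follows.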

We write $D(1,a) - \tfrac13\log a$ as the sum of the terms $A_1(a)$ and $A_2(a)$, where

$$A_1(a) = \sum_{i=1}^{[EN(1,a)]} D_i - \tfrac13\log a,$$

defining $[EN(1,a)]$ as the largest integer not exceeding $EN(1,a) = \tfrac13\log a$, and

$$A_2(a) = \begin{cases}
\sum_{i=[EN(1,a)]+1}^{N(1,a)} D_i, & \text{if } N(1,a) > [EN(1,a)], \\[4pt]
-\sum_{i=N(1,a)+1}^{[EN(1,a)]} D_i, & \text{if } N(1,a) < [EN(1,a)].
\end{cases}$$

We now have, if $N(1,a) > [EN(1,a)]$:

$$\sum_{i=[EN(1,a)]+1}^{N(1,a)} D_i = \sum_{i=[EN(1,a)]+1}^{N(1,a)} (D_i - 1) + N(1,a) - [EN(1,a)],$$

and similarly, if $N(1,a) < [EN(1,a)]$:

$$-\sum_{i=N(1,a)+1}^{[EN(1,a)]} D_i = -\sum_{i=N(1,a)+1}^{[EN(1,a)]} (D_i - 1) + N(1,a) - [EN(1,a)],$$

where both sides are zero if $N(1,a) = [EN(1,a)]$. Hence we can write:

$$D(1,a) - \tfrac13\log a = A_1(a) + N(1,a) - [EN(1,a)] + R(a),$$

where

$$R(a) = \begin{cases}
\sum_{i=[EN(1,a)]+1}^{N(1,a)} (D_i - 1), & \text{if } N(1,a) > [EN(1,a)], \\[4pt]
-\sum_{i=N(1,a)+1}^{[EN(1,a)]} (D_i - 1), & \text{if } N(1,a) < [EN(1,a)].
\end{cases}$$
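Fix $\varepsilon > 0$. By Theorem 2.4 in Groeneboom (1988) there exists an $M = M(\varepsilon) > 0$ and an
$a_0 = a_0(M)$ so that

$$P\Bigl\{\frac{|N(1,a) - [EN(1,a)]|}{\sqrt{\log a}} > M\Bigr\} < \varepsilon, \qquad a \ge a_0.$$

Define

$$n_-(a) = [EN(1,a)] - M\sqrt{\log a}, \qquad n_+(a) = [EN(1,a)] + M\sqrt{\log a}.$$

Then, by Doob's inequality,

$$\begin{aligned}
& P\Bigl\{\max_{m \in [[EN(1,a)]+1,\ n_+(a)]} \Bigl|\sum_{i=[EN(1,a)]+1}^{m} (D_i - 1)\Bigr| > \varepsilon\sqrt{\log a}\Bigr\}
 + P\Bigl\{\max_{m \in [n_-(a),\ [EN(1,a)]]} \Bigl|\sum_{i=m}^{[EN(1,a)]} (D_i - 1)\Bigr| > \varepsilon\sqrt{\log a}\Bigr\} \\
& \qquad \le \frac{n_+(a) - n_-(a) + 1}{\varepsilon^2\log a} \sim \frac{2M}{\varepsilon^2\sqrt{\log a}}.
\end{aligned}$$

These relations imply: $R(a)/\sqrt{\log a} = o_p(1)$, $a \to \infty$, and hence:

$$\frac{D(1,a) - EN(1,a)}{\sqrt{\log a}} = \frac{\sum_{i=1}^{[EN(1,a)]} (D_i - 1)}{\sqrt{\log a}} + \frac{N(1,a) - [EN(1,a)]}{\sqrt{\log a}} + o_p(1). \tag{3.2}$$

The result now follows from Corollary 2 and Theorem 2.4 in Groeneboom (1988).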



Using the methods from Groeneboom (1988) in going from the Poisson approximation to the
sample process, one can now easily deduce the central limit result Theorem 1.1 from Theorem 3.1.
The latter method is also used in Nagaev and Khamdamov (1991).

Remark 3.1. Instead of working directly with relation (2.2), expressing the differences between
successive slopes of the convex hull in terms of the area of the corresponding rectangle and the
y-coordinate of the vertex at the intersection of the line segments with these slopes, Nagaev and
Khamdamov (1991) write this relation first in the following form:

                                                                                   (3.3)

and then deduce a recursive relation for the $U(a_i)$ in terms of the $V(a_i)$ and $D_i$ from this. They
then define the random time

$$\theta_T = \inf\{i : U(a_i) > T\},$$

and consider sums of the form $\sum_{i=1}^{\theta_T} D_i$. This seems to lead to more complicated proofs.

Remark 3.2. The scaling constants for the central limit theorem for the area in Cabo and Groeneboom
(1994) are not correct, although a correct application of the methods used in that paper
would lead to the central limit theorem for the area, which is part of Theorem 1.1
above. We here tried to present the results of the unpublished preprint Nagaev and Khamdamov
(1991) in an easily understandable way, where the presentation is considerably simplified by the
use of martingales, Doob's inequality and the results from Groeneboom (1988). In view of this
simpler approach, and also the fact that Theorem 1.1 is in fact a stronger (2-dimensional) result,
this approach seems preferable to the approach in Cabo and Groeneboom (1994). On the other
hand, the computations along the lines of Cabo and Groeneboom (1994) give precise information
on the first and second moments, as shown below in section 4.

Although Nagaev (1995) hints at the proof of Theorem 1.1, there are many
important missing steps, which have to be traced down to the unpublished preprint Nagaev and
Khamdamov (1991). It seems fair to say that without knowledge of this preprint, deducing the result
from Nagaev (1995) is pretty hard. Moreover, Nagaev (1995) contains in the crucial relation (3.7)
an incorrect scaling constant (the constant 5/4 there should be 20/27), which further complicates
the derivation of Theorem 1.1. For this reason we gave a simplified and self-contained treatment
above.

Remark 3.3. Buchta (2005) gives the following relation between the sample variances of $N_n$ and
$\tilde A_n$ (using the notation of Theorem 1.1):

$$\frac{(n+1)(n+2)\,\mathrm{var}(\tilde A_n)}{n^2} = \mathrm{var}(N_n) + d_{n+2},$$

where

$$d_n = (EN_n)^2 - \frac{n(EN_{n-1})^2}{n-1} - (2n-1)EN_n + 2nEN_{n-1} \sim EN_n \sim \tfrac95\,\mathrm{var}(N_n), \qquad n \to \infty.$$

Hence

$$\mathrm{var}(\tilde A_n) \sim \tfrac{14}{5}\,\mathrm{var}(N_n), \qquad n \to \infty,$$

in accordance with the covariance matrix $\Sigma$ in Theorem 1 in Nagaev and Khamdamov (1991)
(Theorem 1.1 above). Note that the split-up of the variance of $\tilde A_n$ corresponds to the split-up
(3.2) above, where $d_{n+2}$ corresponds to the variance of the exponentials $D_i$ in (3.2) and $\mathrm{var}(N_n)$
corresponds to the variance of the second term on the right-hand side of (3.2).
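The equivalence $d_n \sim EN_n$ can be made plausible by regrouping the terms of $d_n$. The following worked step is a sketch under two assumptions that are made here and are not taken from Buchta (2005): that $EN_n - EN_{n-1} = O(1/n)$ and that $EN_n = O(\log n)$ with $EN_n \to \infty$, as suggested by the logarithmic growth of $EN_n$.

```latex
\begin{align*}
d_n &= (EN_n)^2 - \frac{n}{n-1}(EN_{n-1})^2 - (2n-1)EN_n + 2n\,EN_{n-1} \\
    &= \Bigl\{(EN_n - EN_{n-1})(EN_n + EN_{n-1}) - \frac{(EN_{n-1})^2}{n-1}\Bigr\}
       + EN_n - 2n\,(EN_n - EN_{n-1}) \\
    &= O\Bigl(\frac{(\log n)^2}{n}\Bigr) + EN_n + O(1), \qquad n \to \infty.
\end{align*}
```

Since $EN_n \to \infty$, this gives $d_n \sim EN_n$, and the ratio $EN_n/\mathrm{var}(N_n) \to \tfrac23\big/\tfrac{10}{27} = \tfrac95$ then yields the constant $\tfrac{14}{5} = 1 + \tfrac95$.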

Theorem 2 of Buchta (2003) gives for the number of vertices $N_n$ of the convex hull of the points
(0, 1), (1, 0) and $P_1, \dots, P_n$, where $P_1, \dots, P_n$ is a uniform sample from the interior of the triangle
with vertices (0,0), (0,1) and (1,0):




and 



This gives: 



C n n 

I 1=1 2=1 



$$EN_n \sim \tfrac23\log n, \qquad \mathrm{var}(N_n) \sim \tfrac{10}{27}\log n, \qquad n \to \infty, \tag{3.4}$$

which corresponds to the distribution results derived in Groeneboom (1988), as is also noted in 
Buchta (2003). 

The results in Groeneboom (1988) and Nagaev and Khamdamov (1991) only imply that one gets
a normal limit distribution for the number of vertices of the convex hull of a uniform sample from the
interior of a convex polygon with $r$ vertices by centering with $\tfrac23 r\log n$ and dividing by $(\tfrac{10}{27} r\log n)^{1/2}$.
It is not proved there that the variance of the number of vertices itself is also of order $\log n$. In
principle one could have a central limit theorem where the scaling needed to get the central limit
result is different from what one gets from the actual variance.

However, the only thing that still seems needed to go from (3.4) to the result that the variance
itself is also of order $\log n$ is the appropriate use of the independence of what happens in the
corners of the polygon, so that one can conclude that the variance is the sum of the variances of
the number of vertices in these corners. Moreover, one has to go from what happens in the triangle
to what happens in the corners of the polygon. This is the subject of current research by Buchta.
Results for higher moments of the convex hull of a uniform sample from the triangle with vertices (0, 0),
(0, 1) and (1,0) are given in Buchta (2011).

4. Simulations

Let N(a, b) and D(a, b) be defined as in Theorem 3.1. The distribution of these random variables 
only depends on the ratio b/a and in this section we present some simulation results for these random 
variables, taking a = 1 and replacing b by a. 

The algorithm, given in section 4 of Nagaev (1995), was used to simulate part of the boundary 
of the convex hull of a Poisson process with intensity 1 in the first quadrant. The starting triangle 
is bounded by the x-axis, y-axis and a line of the form x + y = c, where c > 0. Its area D has a 
standard exponential distribution and the point W(l) is uniformly distributed on the line segment 
which is the hypotenuse of this triangle. 

With the algorithm of Nagaev (1995) we can now generate the points $W(a)$, $a \ge 1$, and simulate
in this way the distribution of $N(1,a)$ and $D(1,a)$. We start with $N(1,a)$ and recall the exact
expressions for the expectation $EN(1,a)$ and $\mathrm{var}(N(1,a))$ from Groeneboom (1988), Theorem 2.4:

$$EN(1,a) = \tfrac13\log a, \tag{4.1}$$

and

, , ^ 5 4 , w , ss2 8 I tan -1 {-Ja — l) I 

var(iV(l, a )) = -loga+-(tan- 1 (V^T)) ^== (4.2) 

As noted on top of page 34 in Cabo and Groeneboom (1994), the formula for the variance of $N(1,a)$
given in Theorem 2.1 of Groeneboom (1988) contained a typo (the argument of the first $\tan^{-1}$ above
was $a$ instead of $\sqrt{a-1}$), and the correct formula is in fact given on p. 365 of Groeneboom (1988)
(which we use here). Note that these are exact expressions for $EN(1,a)$ and $\mathrm{var}(N(1,a))$ and not
asymptotic ones.

The following table shows the means and variances for 10,000 simulations for the values log a = 
10, 50 and 100. The exact values are given in 4 decimals accuracy. 

Table 1. Comparison of EN(1,a) and Var(N(1,a)) with simulated and asymptotic values.

log a    simulated    exact      simulated      exact          asymptotic
         EN(1,a)      EN(1,a)    Var(N(1,a))    Var(N(1,a))    Var(N(1,a))
 10       3.3519       3.3333     2.1193         2.0596         1.8519
 50      16.6668      16.6667     9.5908         9.4670         9.2593
100      33.4259      33.3333    18.7039        18.7263        18.5185

It is seen from Table 1 that $EN(1,a)$ and $\mathrm{Var}(N(1,a))$ are quite close to the simulated values
and that, not unexpectedly, for $\log a = 10$ the exact expression for the variance of $N(1,a)$, given by
(4.2), is closer to the simulated value than the asymptotic value.
We similarly did 10,000 simulations for the values $\log a$ = 10, 50 and 100 to simulate the behavior
of $D(1,a)$. Using the (corrected) methods of computation of Cabo and Groeneboom (1994) (details
are given in Groeneboom (2011b)), it can be shown that

$$ED(1,a) = \tfrac13\log a,$$



and, defining a = a — 1, that: 
var(£>(l,a)) 

L 3a 2 ' 9a 45 



14, 2 4 44 2{3 + a(3-4a)}tan- 1 (Va) 4, 



These are again exact expressions for $ED(1,a)$ and $\mathrm{var}(D(1,a))$ and not asymptotic ones. We get
the following results.

Table 2. Comparison of ED(1,a) and Var(D(1,a)) with simulated and asymptotic values.

log a    simulated    exact      simulated      exact          asymptotic
         ED(1,a)      ED(1,a)    Var(D(1,a))    Var(D(1,a))    Var(D(1,a))
 10       3.3664       3.3333     5.4089         5.3040         5.1852
 50      16.6576      16.6667    26.1452        26.0448        25.9259
100      33.4933      33.3333    52.3304        51.9707        51.8519



We finally turn our attention to relation (3.7) in Nagaev (1995). This relation gives asymptotic
expressions for the expectation and variance of the number $\nu_t$ of vertices falling in a disk $S_t$ with
radius $t$ and center (0,0). On the basis of the results in Groeneboom (1988), it is to be expected
that

$$E\nu_t \sim \tfrac43\log t, \qquad \mathrm{var}(\nu_t) \sim \tfrac{20}{27}\log t, \qquad t \to \infty, \tag{4.3}$$

whereas relation (3.7) in Nagaev (1995) gives the above relation for $E\nu_t$, but $(5/4)\log t$ as the
asymptotic expression for $\mathrm{var}(\nu_t)$. The argument for (4.3) is that, first of all, $\nu_t$ can be expected to
behave asymptotically as the number of vertices with coordinates $x > y$ such that $x \le t$ plus the
number of vertices with coordinates $y > x$ such that $y \le t$, since vertices with large x-coordinates
will with high probability be very close to the x-axis and vertices with large y-coordinates will with
high probability be very close to the y-axis. Secondly, again by Groeneboom (1988), the number
of vertices with coordinates $x > y$ such that $x \le t$ will behave asymptotically as $N(1, t^2)$, and
similarly, the number of vertices with coordinates $y > x$ such that $y \le t$ will behave asymptotically
as $N(1/t^2, 1)$.

By the construction of the algorithm in Nagaev (1995), we can simulate the number of vertices
$W(a)$, $a \ge 1$, satisfying $U(a)^2 + V(a)^2 \le t^2$, by running the algorithm till we get a vertex $W(a)$ such
that

$$U(a)^2 + V(a)^2 > t^2.$$

The resulting asymptotic behavior of $E\nu_t$ and $\mathrm{Var}(\nu_t)$ is obtained from this by multiplying the results
by the factor 2. The table below shows the result for 10,000 simulations for the values $\log t$ = 10, 50
and 100.

Table 3. Comparison of Eν_t and Var(ν_t) with simulated and asymptotic values.

log t    simulated    exact       simulated    (20/27) log t    (5/4) log t
         Eν_t         Eν_t        Var(ν_t)
 10       13.0778      13.3333     7.2630        7.40741          12.5
 50       66.4792      66.6667    37.6192       37.0370           62.5
100      133.1330     133.3333    74.542        74.0741          125



Table 3 clearly suggests that the factor 5/4 is much too large and that the correct approximation 
is indeed given by (4.3) above. 

5. Concluding remarks 

There is a remarkable analogy between the behavior of the left-lower convex hull of the Poisson 
point process, discussed above, and the least concave majorant of (one-sided) Brownian motion 
without drift, as analyzed in Groeneboom (1983). In the same way there is an analogy between 
the behavior of the lower convex hull of the Poisson point process inside a parabola, as analyzed in 
Groeneboom (1988) and Nagaev (1995), and the least concave majorant of Brownian motion with 
a parabolic drift, as studied in Groeneboom (1989) and Groeneboom (2011a). Why this is the case 
is still somewhat of a mystery and deserves (in my view) further investigation. 